ABSTRACT


Tanglegrams are graphs consisting of two rooted binary plane trees with the same 
number of leaves and a perfect matching between the two leaf sets. A Tanglegram 
drawing is a special way of drawing a Tanglegram; and a Tanglegram is called planar 
if it has a drawing such that the matching edges do not cross. In this thesis, we 
will discuss various results related to the construction and planarity of Tanglegrams, 
as well as demonstrate how to construct all the Tanglegrams of size 4 by looking 
at two types of rooted binary trees - Caterpillar and Complete Binary Trees. After 
augmenting a Tanglegram with an edge between its roots, we will prove that the 
Tanglegram crossing number of the original Tanglegram is greater than or equal to 
the crossing number of the augmented Tanglegram taken as a graph. We will show 
that the removal of a matching edge from a Tanglegram of size n > 3 decreases the 
Tanglegram crossing number by at most n - 3, and give a family of 1-edge planar 
Tanglegrams (one for every n > 3) of size n with Tanglegram crossing number n - 3, 
showing that the previous statement is sharp. We will also discuss various conditions 


on the nonplanarity of Tanglegrams. 
CHAPTER 1


INTRODUCTION 


In this thesis, we begin in Chapter 2 by introducing the definitions needed to establish 
elementary results that are necessary to facilitate future discussions on Tanglegrams. 
We mention several results related to trees, due to their connection in the construction 
of Tanglegrams. At times, we distinguish between multiple forms of notation 
and state which form is used in this thesis. 

In Chapter 3, we first build our definition of Tanglegrams by discussing both 
complete binary and caterpillar trees, distinguish between unlabelled and labelled 
Tanglegrams, and describe various operations that can be performed on Tanglegrams. 
Then, our discussion is broken into several parts, all of which illustrate 
how to construct Tanglegrams of size 4. First, we consider the type of left and right 
rooted binary trees that will be the bases of our Tanglegrams. Then, we focus on 
determining all the matchings possible between the leaves of the two rooted binary 
trees. We also provide many illustrations of the process throughout the chapter. 

In Chapter 4, we discuss the planarity of Tanglegrams and rely on the idea of 
the Tanglegram crossing number to generalize the planarity of Tanglegrams. Then, 
we focus on proving several results related to Tanglegrams and the augmentation of 
these graphs with an edge connecting the roots of the two rooted binary trees. Our 
discussion concludes by proving a theorem that relates the Tanglegram crossing number 
of a Tanglegram to that of a Tanglegram with one fewer matching edge. 

Finally, we conclude this thesis in Chapter 5 by briefly touching on notable 
properties of induced Subtanglegrams and, in particular, extending the prior planarity 
arguments to these types of graphs. 


CHAPTER 2 


ELEMENTARY GRAPH NOTIONS 


To begin discussing anything of importance to Tanglegrams, we need to state quite 


a few central definitions. 


Definition 1. For a set A and a non-negative integer k, $\binom{A}{k}$ denotes the collection of 
k-element subsets of A. 


Definition 2. A graph G = (V, E) consists of a set V of vertices and a set E of edges 
where $E \subseteq \binom{V}{2}$. We use the notation $e = xy$ for $e = \{x, y\}$, where $e \in E$ and 
$x, y \in V$; we assume that $V \neq \emptyset$. 

Definition 3. We define the order of G to be the number of vertices of G, denoted by 
|V| or |V(G)|. Additionally, we define the size of G to be the number of edges of G, 
denoted by |E| or |E(G)|. 


In this thesis, we make no distinction between |V| and |V(G)| unless otherwise 
stated; the same holds for |E| and |E(G)|. Also, all graphs discussed in this thesis are simple 
finite graphs in that they have at most one edge between any two vertices in V. Note 
that our definition does not allow loops, which are edges whose two endpoints are the 


same. 


Definition 4. We say a vertex v is incident with an edge e if v is an endpoint of e. 
Similarly, v is adjacent to another vertex u if v and u are connected by an edge; u 


and v are called neighbors. 


Definition 5. The complete graph on n vertices, denoted $K_n$, is the graph where all 
vertices are pairwise adjacent; equivalently, it is the graph where every vertex is 
connected to every other vertex by an edge. 


Complete graphs are of particular importance in this thesis, as they will allow us 
to generalize the planarity of graphs. Figure 2.1 depicts $K_3$, $K_4$, 
and $K_5$, the complete graphs on 3, 4, and 5 vertices, respectively. 


Figure 2.1 The complete graphs $K_3$ (top left), $K_4$ (top right), and $K_5$ (bottom). 


Definition 6. Let G = (V, E) and G’ = (V’, E’) both be graphs such that $V' \subseteq V$ 
and $E' \subseteq E$. Then, we say G’ is a subgraph of G, and write $G' \subseteq G$. If $G' \subseteq G$ 
but $G' \neq G$, then G’ is a proper subgraph of G. 

Definition 7. Let G = (V, E) be a graph. Then, an induced subgraph G’ = (V’, E’) 
of G is a subgraph formed from a subset $V' \subseteq V$ and all of the edges of E that 
connect pairs of vertices in that subset (i.e. $E' = E \cap \binom{V'}{2}$). Furthermore, a spanning 
subgraph of G is a subgraph that contains all the vertices of the original graph. 


We now quickly mention some important notation revolving around the removal 


of one or more vertices or edges of a graph. 


Notation 1. For a graph G and a vertex $v \in V(G)$, G - v denotes the graph induced 
by $V(G) \setminus \{v\}$. Similarly, let $A = \{v_1, v_2, \ldots, v_n\} \subseteq V(G)$ such that $v_i \in V$ for all 
$1 \le i \le n$. We denote the graph G with the removal of all the vertices contained in 
A by G - A, where the order of the removal of the vertices of A does not matter. 
Additionally, for G = (V, E) and $v \notin V$, $G + v = (V \cup \{v\}, E)$. 

Notation 2. For a graph G and an edge $e \in E(G)$, G - e denotes the graph with 
vertex set V(G) and edge set $E(G) \setminus \{e\}$. Similarly, let $B = \{e_1, e_2, \ldots, e_n\} \subseteq E(G)$ 
such that $e_i \in E$ for all $1 \le i \le n$. We denote the graph G with the removal of all 
the edges contained in B by G - B, where the order of the removal of the edges of 
B does not matter. Similar notation follows for the addition of an edge e: G + e. 

Now, back to some key definitions needed to elaborate on future results. 


Definition 8. The degree of a vertex v in a graph G, denoted by $\deg_G(v)$, is the 
number of edges incident to v; i.e. the number of edges that have v as an endpoint. 
We define the numbers $\delta(G)$ and $\Delta(G)$ as follows. 

$\delta(G) = \min\{\deg_G(v) \mid v \in V(G)\}$ 
$\Delta(G) = \max\{\deg_G(v) \mid v \in V(G)\}$ 


Definition 9. Let G = (V, E) be a graph. We define a path on G to be a sequence of 
k + 1 distinct vertices $v_0 v_1 \ldots v_k$ (where $k \in \mathbb{N}$) such that for each $i \in [k]$, $v_{i-1}v_i \in E$. 
Note, $v_0$ and $v_k$ are the endvertices of the path, and the length of the path is the 
number of edges on the path, which is k. 


Definition 10. Let G = (V, E) be a graph. We define a cycle on G to be a sequence 
of vertices $v_1 v_2 \ldots v_k v_1$ such that $v_1 \ldots v_k$ is a path P and $v_k v_1$ is an edge of the graph 
that is not on the path. Note that this implies that k is at least 3. The length of the 
cycle is the number of edges on the cycle, which is k. 


Definition 11. A graph G is connected if there is a path between any two vertices in 


G. 


Now that we have defined what is meant by a connected graph, we can discuss 
some important terms and concepts related to induced subgraphs and minimal 
spanning subgraphs. 


Definition 12. Let G = (V,E) be a graph. A component of G is a maximal 
connected subgraph of G. Note that this means components have disjoint vertex sets. 


We claim that the components of a graph are induced subgraphs. Consider a 
component H of G. By definition it is a subgraph, so its vertex set is some $U \subseteq V$. Let 
$u_1, u_2 \in U$ such that $u_1u_2 \in E$. If $u_1u_2 \notin E(H)$, then $H' = H + u_1u_2$ is a subgraph of G 
that is still connected (as any two vertices of H' have a path between them even in H), and 
H' contains H as a proper subgraph, contradicting that H was a maximal connected 
subgraph. Thus all edges connecting vertices in U are edges of H, showing that H is 
induced. 


Figure 2.2 A minimal connected spanning subgraph of $K_5$. 


We will “cycle” back to graphs in a moment, but now it is time to throw away 


cycles to begin with and discuss cycle-free graphs. 


Definition 13. A tree T = (V, E) is a connected graph that is acyclic, or cycle-free. 
All the vertices with degree at most one are called leaves. Any other vertex in T is 
called an internal vertex. The order of T, denoted |T|, is the number of vertices of 


T. Note that the vertex in the single vertex tree is a leaf. 


Theorem 1. For a graph T on n vertices the following are equivalent: 


1. T is a tree. 
2. There is a unique path between any two vertices of T. 


3. T is minimally connected; i.e. T is connected but T - e is not for all edges 
$e \in E(T)$. 

4. T is maximally acyclic; i.e. T contains no cycles but T + e does for e = xy, 
where x and y are any two non-adjacent vertices in T. 

5. T is connected and has n - 1 edges. 

6. T is acyclic and has n - 1 edges. 


Most of the proofs for the equivalences above can be found in [5]. We will show 
one of these results, namely that if T is a tree with n vertices, then T has n - 1 edges, 
simply because the proof is a standard exercise. 

Proposition 2. If T is a tree with n vertices, then T has n - 1 edges. 


Proof. Let T be a tree such that |T| = n. We will work by induction on n. 
Suppose |T| = 1. Then, T is a single vertex with no edges and the result holds. 
Suppose |T| = 2. Then, T has two vertices that must be joined by an edge for T 
to be a connected graph. As a simple graph on two vertices has at most one edge, 
T has exactly 1 edge. So, the result holds. 

Now suppose the result holds for all trees T with |T| = n, where n > 1, and let 
T’ = (V’, E’) be a tree with n + 1 vertices. We claim that T’ has a leaf. Take a longest 
path in T’. The endvertices of this path have degree 1. Otherwise we would have either a 
cycle or a longer path, contradicting the choice of a longest path. 

So, let v be a leaf of T’, let $u \in V(T')$ be its unique neighbor, and let e = vu be the unique 
edge incident with v. By removing the leaf v and the edge e, we obtain a graph 
G’ that is still connected and acyclic, but with one less vertex: |G’| = (n + 1) - 1 = n. Then, by the 
induction hypothesis, G’ has n - 1 edges. Thus, we have that |E(T’)| = (n - 1) + 1 = n, 
and so T’ is a tree with n + 1 vertices and n edges. By induction, we are finished. 
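Item 5 of Theorem 1 also gives a simple computational test for being a tree. The following is a minimal Python sketch of that test, not part of the original development; the function name `is_tree` and the representation of a graph as a vertex list plus a list of vertex pairs are our own assumptions, and the input is assumed to be a simple graph.

```python
from collections import deque


def is_tree(vertices, edges):
    """Check Theorem 1, item 5: connected and exactly n - 1 edges.

    `vertices` is an iterable of hashable labels and `edges` a list of
    pairs (u, v); the graph is assumed simple (no loops, no repeats).
    """
    vertices = list(vertices)
    if not vertices or len(edges) != len(vertices) - 1:
        return False
    # Build adjacency sets.
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Breadth-first search from an arbitrary vertex to test connectivity.
    seen = {vertices[0]}
    queue = deque([vertices[0]])
    while queue:
        x = queue.popleft()
        for y in adj[x] - seen:
            seen.add(y)
            queue.append(y)
    return len(seen) == len(vertices)
```

For instance, a path on four vertices passes the test, while a triangle together with an isolated vertex has the right number of edges but fails the connectivity check.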


In this thesis, we need to specify a type of tree that will help us form future 


graphs. The particular type of tree we care to discuss is called a binary tree. 


Definition 14. A rooted tree is a tree with a vertex specified as a root. If T is a rooted 
tree with root r, then for any non-root vertex y, the parent of y is the neighbor of y 
on the unique r-y path in T; the root has no parent. If y is any vertex of T, then 


the children of y are those neighbors of y that are not the parent of y. 


Note that this means that all neighbors of the root are the children of the root, 


and leaves have no children. 


Definition 15. A rooted binary tree L is a rooted tree where any non-leaf vertex has 
precisely two children. Any non-leaf vertices are internal vertices. The size of L is 
the number of leaves L has. Note that “size” here differs from “size” in Definition 3: 
since we know exactly the number of edges in a tree, we rarely speak about the number 
of edges in the coming discussion. A cherry of L is a pair of leaves with the 
same parent. 

Definition 16. A graph G = (V, E) is bipartite if V admits a partition 
into 2 classes, or sets, A and B such that every edge has one end in each class; 
that is, no 2 vertices of G in the same class are adjacent. A bipartite graph is 
complete if each pair of vertices from different classes is adjacent; the complete 
bipartite graph with |A| = m and |B| = n is denoted $K_{m,n}$. Figure 2.3 gives three 
different drawings of $K_{3,3}$, the complete bipartite graph on two classes each 
containing 3 vertices. 


Figure 2.3 Three different drawings of the bipartite graph $K_{3,3}$ (Fig. 1.6.2 from [5]). 


Definition 17. If G = (V, E) is a graph, and e = uv is an edge of G, then G/e (G 
contracted on the edge e) is the graph with vertex set $(V \setminus \{u, v\}) \cup \{w_{uv}\}$ and edge 
set $\{xy : xy \in E, \{x, y\} \cap \{u, v\} = \emptyset\} \cup \{x w_{uv} : x \in V \setminus \{u, v\}, (xu \in E \text{ or } xv \in E)\}$. 
In other words, we remove the edge uv, identify the vertices u, v (the resulting vertex 
is called $w_{uv}$) and remove any duplicates from the resulting edges. 
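To make the contraction operation concrete, the following is a minimal Python sketch under our own assumptions: edges are stored as two-element `frozenset`s, the function name `contract` is hypothetical, and the merged vertex $w_{uv}$ is represented by the tuple `("w", u, v)`.

```python
def contract(vertices, edges, u, v):
    """Return (V', E') for G/e where e = uv, following Definition 17.

    `vertices` is a set of hashable labels and `edges` a set of
    two-element frozensets; the graph is assumed to be simple.
    """
    e = frozenset((u, v))
    if e not in edges:
        raise ValueError("uv must be an edge of G")
    w = ("w", u, v)  # stand-in for the merged vertex w_uv (our naming)
    new_vertices = (set(vertices) - {u, v}) | {w}
    new_edges = set()
    for f in edges:
        if f == e:
            continue  # the contracted edge disappears
        # Redirect any endpoint equal to u or v to the merged vertex;
        # storing edges in a set removes duplicates automatically.
        new_edges.add(frozenset(w if x in (u, v) else x for x in f))
    return new_vertices, new_edges
```

Contracting any edge of the complete graph on 4 vertices in this way leaves a graph on 3 vertices with 3 edges, that is, a triangle.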


Definition 18. A minor of a graph G is a graph that can be obtained from a subgraph 
of G by edge contractions. 


Definition 19. A subdivision of a graph G is a graph obtained by replacing edges of 
G with new paths of length at least one connecting the endpoints of the former edge. 
The new vertices introduced with the paths differ from the old vertices of G and new 
vertices introduced for different paths are distinct. A subdivision never decreases the 


number of vertices. Figure 2.5 illustrates this concept. 


Definition 20. A planar drawing is a drawing where the vertices of the graph are 
represented by different points in the plane and edges are represented by simple 
curves connecting their endpoints (i.e. the points representing their endvertices) such 
that the interiors of these curves are disjoint from the set of points that are vertices 
of the graph and from each other. A planar graph is any graph that has a planar 
drawing. 


Figure 2.4 Contracting the edge e of G to obtain G’. This results in G’ being a 
minor of G. 


Figure 2.5 The graph $G_2$ is a subdivision of the graph $G_1$. 


Definition 21. A graph G is planar if it can be drawn in a way such that no edges 


intersect each other except at their endpoints. 


We illustrate the concept of subdividing a graph in Figure 2.5. This idea is crucial 


to understanding the planarity generalization embedded in Kuratowski’s Theorem. 


Theorem 3 (Kuratowski’s Theorem, [7]). A graph is nonplanar if and only if it 
contains a subdivision of $K_{3,3}$ or $K_5$. 
Definition 22. A matching of a graph is a subset M of the edge set where each vertex 
has either one or zero edges incident to it in M. The matching M is perfect if every 
vertex is incident with exactly one edge in M. 
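As a small illustration of Definition 22, the following Python sketch tests whether a set of edges is a matching and whether it is perfect. The function names and the representation of edges as pairs are our own assumptions.

```python
def is_matching(edges):
    """True if no vertex is an endpoint of two edges in `edges`."""
    seen = set()
    for u, v in edges:
        if u == v or u in seen or v in seen:
            return False
        seen.update((u, v))
    return True


def is_perfect_matching(vertices, edges):
    """True if `edges` is a matching covering every vertex exactly once."""
    covered = {x for e in edges for x in e}
    return is_matching(edges) and covered == set(vertices)
```

In a Tanglegram the relevant vertex set is the union of the two leaf sets, and each matching edge joins a leaf of the left tree to a leaf of the right tree.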

In Chapter 3, we explore the properties and characteristics of a particular type of 


graph called a Tanglegram. 
CHAPTER 3 


INTRODUCTION TO TANGLEGRAMS 


We now turn to a much deeper discussion on a peculiar type of graph known as a 
Tanglegram. Tanglegrams play an important role in phylogenetics, particularly in 
the theory of co-speciation. As we will see by its construction, the two rooted trees 
that form a Tanglegram represent the phylogenetic tree of hosts and the phylogenetic 
tree of their parasites. Although the results discussed in this thesis do not pertain to 
any significant biological discussion or applications, one could find themselves drawn 
to Tanglegrams simply for their inherent ability to illustrate the beauty of co-species 
relations; refer to [8] for more information. 

Recall that in Definition 15, we defined what a rooted binary tree is. The binary 
tree drawn in Figure 3.1 is called a Caterpillar, but different drawings of a Caterpillar 
are easy to make. We define the two types of trees of importance below. 


Definition 23. A Rooted Caterpillar, denoted $C_n$, of size n is the unique rooted binary 
tree with n leaves such that there are two leaves of distance n - 1 from the root and 
for each $i \in \{1, 2, \ldots, n - 2\}$, there is one leaf of distance i from the root. Figure 3.1 
gives a progression of Caterpillar trees up to size 4. 


Definition 24. A Complete Binary Tree is a rooted binary tree, with height k, where 
every leaf is at distance k from the root. A Complete Binary Tree of height k has $2^k$ leaves. 
Figure 3.2 gives a progression of the Complete Binary Trees up to height 3. 
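The two families can be generated mechanically. The following minimal Python sketch is our own illustration, assuming a rooted binary tree is encoded as nested tuples in which a leaf is the empty tuple `()` and an internal vertex is a pair of subtrees; the function names are hypothetical.

```python
def caterpillar(n):
    """Rooted Caterpillar C_n (n >= 1) as nested tuples; a leaf is ()."""
    if n < 1:
        raise ValueError("size must be at least 1")
    tree = ()
    for _ in range(n - 1):
        tree = ((), tree)  # hang the previous caterpillar below a new root
    return tree


def complete_binary_tree(k):
    """Complete Binary Tree of height k as nested tuples; a leaf is ()."""
    tree = ()
    for _ in range(k):
        tree = (tree, tree)  # both children are copies of the smaller tree
    return tree


def leaf_depths(tree, depth=0):
    """Distances from the root to each leaf, listed left to right."""
    if tree == ():
        return [depth]
    return leaf_depths(tree[0], depth + 1) + leaf_depths(tree[1], depth + 1)
```

Calling `leaf_depths(caterpillar(4))` lists one leaf at distance 1, one at distance 2, and two at distance 3, in agreement with Definition 23, and `leaf_depths(complete_binary_tree(3))` lists 8 leaves, all at distance 3.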


By assigning a matching between the leaves of two arbitrary rooted binary trees 


of the same order, we can finally construct the Tanglegram. 
Figure 3.1 A progression of Caterpillar Trees of size 1 to 4. 


Figure 3.2 A progression of Complete Binary Trees of height 0 to 3. 


Definition 25. A Tanglegram $(\mathcal{L}, \mathcal{R}, \sigma)$ of size n is a graph consisting of an n-leaf left 
binary tree $\mathcal{L}$ with a root r, an n-leaf right binary tree $\mathcal{R}$ with a root p, and a perfect 
matching $\sigma$ between the leaves of $\mathcal{L}$ and $\mathcal{R}$. The size of a Tanglegram is the number 
of leaves in $\mathcal{L}$ or $\mathcal{R}$. 


For much of graph theory, we do not care for labels of vertices on the graphs. The 
same follows for Tanglegrams. Some of the Tanglegrams depicted here have labels on 


their vertices and some do not. 
Definition 26. Two Tanglegrams $(\mathcal{L}_1, \mathcal{R}_1, \sigma_1)$ and $(\mathcal{L}_2, \mathcal{R}_2, \sigma_2)$ are considered the same 
if there exists a bijection between the vertex sets of the two Tanglegrams such that 
the following hold. 

• The left root maps to the left root. 
• The right root maps to the right root. 
• The bijection is a graph isomorphism between the two Tanglegrams. 


If you consider size n Tanglegrams as a labeled graph, the bijection described 
in the previous definition defines an equivalence relation on these Tanglegrams, and 
the different equivalence classes of this relation can be considered as the different 
Tanglegrams. These Tanglegrams correspond to graphs where only the roots of the 


left tree and the right tree have labels (identifying which tree they belong to as roots). 

Definition 27. A Tanglegram Layout of $(\mathcal{L}, \mathcal{R}, \sigma)$ is a straight line drawing consisting of: 

1. A left planar binary tree that is isomorphic to $\mathcal{L}$, with a root r, drawn 
in the half-plane $x \le 0$ with its leaves on the line x = 0. 

2. A right planar binary tree that is isomorphic to $\mathcal{R}$, with a root p, drawn 
in the half-plane $x \ge 1$ with its leaves on the line x = 1. 

3. A perfect matching $\sigma$ between their leaves drawn in straight line segments. 


Definition 28. A switch on the Tanglegram Layout of a Tanglegram $(\mathcal{L}, \mathcal{R}, \sigma)$ is the 
following operation: select an internal vertex v of one of the two trees $\mathcal{L}$ and $\mathcal{R}$ 
and change the order of its two children. Then, draw the subtrees rooted at the 
children the same way as they were drawn before. Figure 3.3 illustrates a switch on 
a Tanglegram of size 4 at the root p. 
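The switch operation can be sketched in Python as follows. This is only an illustration under our own assumptions: a plane tree is encoded as nested pairs whose leaves are arbitrary non-tuple labels, an internal vertex is addressed by a hypothetical `path` of child indices (0 or 1) from the root, and the function names are ours.

```python
def switch(tree, path=()):
    """Return a copy of `tree` with the children swapped at one vertex.

    `path` is a sequence of child indices (0 or 1) leading from the root
    to the internal vertex to switch at; the empty path means the root.
    """
    if not isinstance(tree, tuple):
        raise ValueError("a switch can only be performed at an internal vertex")
    if not path:
        first, second = tree
        return (second, first)  # subtrees are kept as they were drawn
    children = list(tree)
    children[path[0]] = switch(children[path[0]], path[1:])
    return tuple(children)


def leaf_order(tree):
    """Order in which the leaves appear along the line of the layout."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaf_order(tree[0]) + leaf_order(tree[1])
```

A switch at the root reverses the order of the two subtrees hanging from the root but leaves the order of the leaves inside each subtree unchanged.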


To illustrate the difference between labeled and unlabeled Tanglegrams, consider 


the 2 Tanglegrams presented in Figure 3.4. 
Figure 3.3 The results of a switch operation (panels: Original Layout; After Switching at p). 


The left tree in both of them is a size 3 tree with root r, other internal 
vertex a, where r is adjacent to leaf $\ell_3$, and a is adjacent to leaves $\ell_1$ and $\ell_2$. Also, 
the right tree in both of them has root p, other internal vertex b, where p is adjacent 
to leaf $m_3$, and b is adjacent to leaves $m_1$ and $m_2$. 

The matching in the first one is $\{m_1\ell_1, m_2\ell_2, m_3\ell_3\}$ and in the second one is 
$\{m_1\ell_2, m_2\ell_1, m_3\ell_3\}$. They are different as labeled graphs, but the following 
isomorphism shows that they are the same Tanglegram: 

$$f(v) = \begin{cases} v, & \text{if } v \notin \{\ell_1, \ell_2\} \\ \ell_2, & \text{if } v = \ell_1 \\ \ell_1, & \text{if } v = \ell_2 \end{cases}$$

The 16 pictures in Figures 3.5 and 3.6 illustrate the different layouts of the 


same Tanglegram. You have 16 different layouts if you consider the Tanglegram as a 
labeled graph, but only 8 as a Tanglegram layout. 

We then show that there are 16 different labelled Tanglegram layouts of this 
labelled Tanglegram. As unlabelled and labelled Tanglegrams they are all the same, 
but these are different labelled Tanglegram drawings of the same labelled Tanglegram. 


Figures 3.5 and 3.6 show all 16 drawings. 
Figure 3.4 These Tanglegrams are different as labeled Tanglegrams, but are the 
same as unlabelled Tanglegrams. 


Figure 3.7 gives a picture of a Tanglegram of size 4 using two Caterpillar trees as 
its two rooted binary trees. The perfect matching of its leaves is denoted by dotted 
lines between its leaves. There are 12 additional distinct Tanglegrams 
of size 4, which are illustrated later in Figure 3.14. 

We can determine the number of distinct Tanglegrams of a particular size by 
keeping only those drawings that are not symmetric images of any other. 
There is a unique binary tree with 1 leaf: the singleton vertex. We can obtain all 
rooted binary trees with n > 1 leaves by using the rooted binary trees with n - 1 
leaves and adding two children to one of their leaves. Now this means that there is 
only one Tanglegram of size 1: both the left and right trees are singleton vertices, and 
the matching connects these two vertices. 
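The growth process just described can be sketched in Python. This is our own illustration rather than part of the thesis: an unlabeled rooted binary tree is assumed to be encoded as nested tuples with `()` for a leaf, isomorphic trees are identified by sorting the two children of every vertex, and the names `canon`, `grow` and `trees` are hypothetical.

```python
def canon(tree):
    """Canonical form of an unlabeled rooted binary tree (leaf = ())."""
    if tree == ():
        return ()
    a, b = canon(tree[0]), canon(tree[1])
    return (a, b) if a <= b else (b, a)


def grow(tree):
    """Yield every tree obtained by adding two children to one leaf."""
    if tree == ():
        yield ((), ())
        return
    left, right = tree
    for g in grow(left):
        yield canon((g, right))
    for g in grow(right):
        yield canon((left, g))


def trees(n):
    """Set of rooted binary trees with n leaves, up to isomorphism."""
    level = {()}  # the singleton vertex
    for _ in range(n - 1):
        level = {g for t in level for g in grow(t)}
    return level
```

Because the results are collected in a set of canonical forms, trees that arise from different leaves or different parents but are isomorphic are counted only once.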

So for 2 leaves, we take the singleton vertex and add two leaves to it (this is both 
a rooted Caterpillar and a rooted Complete Binary tree of height 1) - there is only 
one such tree. For Tanglegrams of size 2, the left and right trees must be equal to 
this unique tree, and there is only one way to match them (there are two ways if we 


consider them as labeled graphs, but they are the same Tanglegram). See Figure 3.8. 


For rooted binary trees on 3 leaves, there is one rooted (unlabeled) binary tree T’ 
on 2 leaves. As the two leaves are not distinguishable (a tree isomorphism takes one 
into the other), there is only one way to obtain a rooted binary tree on 3 leaves by 
appending two children to a leaf of T’. The resulting tree is a rooted Caterpillar with 
two distinguishable kinds of leaves: the leaf adjacent to the root, and the two leaves 
forming a cherry, which are not distinguishable from each other. Now the matching 
can do two things: the leaves connected to the root on the left and right trees can be 
matched to each other or not. The two possible size 3 Tanglegrams are discussed below. 


Figure 3.5 The first 4 pairs of layouts from corresponding set X = {r, a, b, p} 
(panels: Subset {p}, Subset {p, a, b}, Subset {a}, Subset {b}). 


Figure 3.6 The second 4 pairs of layouts from corresponding set X = {r, a, b, p} 
(panels: Subset {p, a}, Subset {p, b}, Subset {r, a, p}, Subset {r, b, p}). 


Figure 3.7 A Tanglegram of size 4 with two $C_4$ graphs as the rooted binary trees. 


Figure 3.8 A Tanglegram of size 2 with two rooted Complete Binary trees. 

As both the left and right tree must be a caterpillar tree, they have exactly one 
leaf that is the child of the root, and two other leaves at distance two from the root. 
So, we have two cases: the leaves that are children of the two roots are matched to 
each other, or they are not. Both of these cases result in a unique Tanglegram, as 
illustrated in Figure 3.9. 

We now turn to Tanglegrams of size 4. Note that the 4-leaf complete binary tree has two 
cherries; the 4-leaf caterpillar has a cherry (both of its leaves at distance 3 from the 
root), a leaf at distance one and another leaf at distance 2 from the root. If the two 


sides are both complete binary trees, we have two cases: 


• If one of the cherries on the left is matched to a cherry on the right, then the 
other cherry on the left must match to the other cherry on the right. This defines 
a unique (unlabeled) Tanglegram. 

• If the previous case does not happen, then both leaves in a cherry need to 
be matched to two leaves that do not form a cherry. This defines a unique 
(unlabeled) Tanglegram. 


Figure 3.9 The two different Tanglegrams of size 3. 


If the left tree is a caterpillar and the right tree is a complete binary tree, then 


we have two cases: 


• The cherry in the caterpillar is matched to a cherry in the binary tree: this 
defines a unique Tanglegram. 

• The cherry in the caterpillar is not matched to a cherry in the binary tree: this 
defines a unique Tanglegram. 


If the left tree is a complete binary tree and the right tree is a caterpillar, the two 
cases are analogous to those above. 


Now, if both the left and right trees are caterpillars, we have the following cases: 
• The cherry on the left is matched to the cherry on the right, in which case the 
following subcases occur. 

  - The leaves at distance 1 from the root match to each other (and consequently 
    so do the leaves at distance 2 from the root). 

  - The leaves at distance 1 from the root do not match to each other (and 
    consequently a leaf at distance one from the root on one side matches to 
    a leaf at distance two from the root on the other side). 

• The cherry on the left is matched to leaves that are not part of the cherry on 
the other side (and consequently the cherry on the right is matched to leaves 
that are not part of the cherry on the left); this defines a unique Tanglegram. 

• One leaf of the cherry on the left is matched to one leaf of the cherry on the 
right, but the other leaf of the cherry on the left is not matched to the cherry 
on the right. We have the following subcases based on where the other leaf of each 
cherry matches. 

  - The other leaf of both the left and the right cherry matches to the leaf at 
    distance one from the root on the other side (and consequently the leaves at 
    distance two from the root match to each other). 

  - The other leaf of both the left and the right cherry matches to the leaf at 
    distance two from the root on the other side (and consequently the leaves at 
    distance one from the root match to each other). 

  - The other leaf of the left cherry matches to the leaf at distance one from the 
    root on the right, and the other leaf of the right cherry matches to the leaf at 
    distance two from the root on the left. 

  - The other leaf of the left cherry matches to the leaf at distance two from the 
    root on the right, and the other leaf of the right cherry matches to the leaf at 
    distance one from the root on the left. 


Figure 3.10 The 2 different Tanglegrams of size 4 with Complete Binary Trees as 
$\mathcal{L}$ and $\mathcal{R}$ (panels: $T_1 = (\mathcal{L}, \mathcal{R}, \sigma_1)$ and $T_2 = (\mathcal{L}, \mathcal{R}, \sigma_2)$). 


Organizing all of our perfect matchings and figures, we conclude that there are 13 
Tanglegrams of size 4; Figure 3.14 illustrates all 13 of them. 
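The case analysis above can be cross-checked by brute force for small sizes. The sketch below is our own and makes several assumptions: each matched pair of leaves is identified and given a common label from 0 to n - 1, so a Tanglegram becomes a pair of leaf-labeled trees; two Tanglegrams are the same exactly when some relabeling maps one pair onto the other; and the names `shapes`, `label`, `canon` and `count_tanglegrams` are hypothetical. It is only practical for very small n.

```python
from itertools import permutations


def shapes(n):
    """Unlabeled rooted binary trees with n leaves (leaf = ())."""
    if n == 1:
        return [()]
    found = set()
    for k in range(1, n // 2 + 1):
        for a in shapes(k):
            for b in shapes(n - k):
                found.add(tuple(sorted((a, b))))
    return sorted(found)


def label(shape, labels):
    """Replace the leaves of `shape`, left to right, by items of `labels`."""
    if shape == ():
        return next(labels)
    return (label(shape[0], labels), label(shape[1], labels))


def canon(tree, relabel):
    """String form of a leaf-labeled tree that ignores child order."""
    if isinstance(tree, int):
        return str(relabel[tree])
    a, b = sorted((canon(tree[0], relabel), canon(tree[1], relabel)))
    return f"({a},{b})"


def count_tanglegrams(n):
    """Number of Tanglegrams of size n, by exhaustive enumeration."""
    perms = list(permutations(range(n)))
    classes = set()
    for left_shape in shapes(n):
        left = label(left_shape, iter(range(n)))
        for right_shape in shapes(n):
            for sigma in perms:
                # The right leaf in position j is matched to left leaf sigma[j].
                right = label(right_shape, iter(sigma))
                # Smallest form over all relabelings = class representative.
                classes.add(min((canon(left, p), canon(right, p)) for p in perms))
    return len(classes)
```

For sizes 1 to 4 this enumeration returns 1, 1, 2 and 13, matching the counts obtained by hand in this chapter.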


Now rooted binary trees with 5 leaves can be obtained from the following: 

• The Complete Binary tree of height 2 by adding 2 children to one of its leaves 
(as the leaves are indistinguishable, there is only 1 such tree, neither Caterpillar 
nor Complete Binary). 

• The rooted Caterpillar with 4 leaves by adding 2 children to a ... 

  - Leaf at distance 1 from the root (neither Caterpillar nor Complete Binary) 
  - Leaf at distance 2 (neither Caterpillar nor Complete Binary) 
  - One of the cherry leaves (Caterpillar) 

We can continue in this manner to find all the Tanglegrams of size n, but one will find 
that this number grows quite quickly. In consideration of the enumeration problem 
for Tanglegrams, [4] obtained an explicit formula for the number $T_n$ of Tanglegrams with n leaves 
on each side. The following asymptotic formula holds for the counting sequence 

1, 1, 2, 13, 114, 1509, 25595, 535753, 13305590, 382728552, ... 

thanks to work done by [2]. We will end with determining the number of size 4 
Tanglegrams and encourage the reader to allocate what time they would have spent drawing 
out all the Tanglegrams of size 5 into something much more productive. 


Figure 3.11 The 7 different Tanglegrams of size 4 with Rooted Caterpillars as $\mathcal{L}$ 
and $\mathcal{R}$ (panels: $T_3 = (\mathcal{L}, \mathcal{R}, \sigma_3)$ through $T_9 = (\mathcal{L}, \mathcal{R}, \sigma_9)$). 


Figure 3.12 The 2 different Tanglegrams of size 4 with Complete Binary Tree $\mathcal{L}$ 
and Rooted Caterpillar $\mathcal{R}$ (panels: $T_{10} = (\mathcal{L}, \mathcal{R}, \sigma_{10})$ and $T_{11} = (\mathcal{L}, \mathcal{R}, \sigma_{11})$). 


Figure 3.13 The 2 different Tanglegrams of size 4 with Rooted Caterpillar $\mathcal{L}$ and 
Complete Binary Tree $\mathcal{R}$ (panels: $T_{12} = (\mathcal{L}, \mathcal{R}, \sigma_{12})$ and $T_{13} = (\mathcal{L}, \mathcal{R}, \sigma_{13})$). 

The next chapter generalizes the planarity argument of Kuratowski’s Theorem, 


and applies it to Tanglegrams. 


Figure 3.14 The 13 tanglegrams of size 4. T, and Tg cannot be drawn without at 
least one crossing of the matching edges. 
CHAPTER 4 


TANGLEGRAM CROSSING NUMBERS AND PLANARITY 


We have discussed how to draw Tanglegrams in the plane (i.e. what a Tanglegram 
layout is). This definition ensures that in a layout two edges can cross only if they 
both are matching edges, but, as Figure 3.6 illustrates, in the different layouts of a 
Tanglegram the number of crossings that occur can be different. 


Definition 29. A graph drawing, D, is a drawing of a graph where the vertices of the 
graph are represented by points and edges are represented by simple curves connecting 


their endpoints, and not going through any other vertices of the graph. 


Definition 30. For a drawing D and edges e, f, let $\operatorname{cr}_D(e, f)$ denote the number of 
common interior points of e and f. Then, the crossing number of the drawing D is 
$$\operatorname{cr}(D) = \sum_{\{e, f\} \in \binom{E}{2}} \operatorname{cr}_D(e, f).$$
The crossing number of a graph, denoted cr(G), is the minimal crossing number of 
all of its drawings. 


Definition 31. The Tanglegram crossing number of a Tanglegram T, denoted crt(T), is the minimal 
crossing number over all of its layouts. 
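For small Tanglegrams this minimum can be found by exhaustive search over all layouts. The following Python sketch is our own illustration under stated assumptions: both trees are nested pairs whose leaves are integers, matched leaves carry the same integer, every layout is reached by some combination of switches, and two straight matching edges cross exactly when their endpoints appear in opposite orders on the lines x = 0 and x = 1. The function names are hypothetical.

```python
from itertools import product


def layouts(tree):
    """Yield every leaf order obtainable by switches at internal vertices."""
    if isinstance(tree, int):
        yield (tree,)
        return
    for a, b in product(list(layouts(tree[0])), list(layouts(tree[1]))):
        yield a + b  # children in the given order
        yield b + a  # children after a switch at this vertex


def crossings(left_order, right_order):
    """Number of crossing pairs of matching edges in one layout."""
    position = {leaf: i for i, leaf in enumerate(right_order)}
    seq = [position[leaf] for leaf in left_order]
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )


def tanglegram_crossing_number(left, right):
    """Minimum number of crossings over all layouts (brute force)."""
    return min(
        crossings(a, b) for a in layouts(left) for b in layouts(right)
    )
```

As an example, take both trees to be complete binary trees on four leaves, with left cherries {0, 1} and {2, 3} and right cherries {0, 2} and {1, 3}. No layout can list the leaves in the same order on both sides, and the search returns 1.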


Definition 32. A planar graph is a graph that has a planar drawing, i.e. a drawing 
with crossing number 0. A plane graph is a planar graph together with a planar 
drawing (i.e. two different planar drawings of the same graph are different plane 


graphs). 

Definition 33. A Tanglegram is planar if its Tanglegram crossing number is zero; 
in other words, if it has a layout without crossing matching edges. Otherwise, the 
Tanglegram is called nonplanar. 


Let us explore some interesting characteristics of planarity and crossing numbers 


of graphs. 


Proposition 4. For a connected plane graph with v vertices, e edges and f faces, we 
have v - e + f = 2. 
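As an aside that is not part of the original development, Proposition 4 yields a quick check that $K_5$ and $K_{3,3}$ cannot be planar. The sketch below assumes a simple connected plane graph with at least 3 vertices, in which every face is bounded by at least 3 edges (at least 4 if the graph has no triangles) and every edge lies on the boundary of at most 2 faces.

```latex
% Counting edge-face incidences gives 3f <= 2e in general and
% 4f <= 2e for triangle-free graphs; substitute into v - e + f = 2.
\begin{align*}
2 = v - e + f \le v - e + \tfrac{2}{3}e
  &\implies e \le 3v - 6, \\
K_5:\quad e = 10 &> 9 = 3 \cdot 5 - 6, \\
2 = v - e + f \le v - e + \tfrac{1}{2}e
  &\implies e \le 2v - 4 \quad \text{(triangle-free)}, \\
K_{3,3}:\quad e = 9 &> 8 = 2 \cdot 6 - 4.
\end{align*}
```

Both graphs violate the corresponding bound, so neither has a planar drawing.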


We can construct familiar drawings of $K_5$ and $K_{3,3}$ optimally with respect to their 
crossing number. Although we do not prove that the crossing numbers of $K_5$ and 
$K_{3,3}$ are not less than 1, the graphs in Figure 4.1 are drawn in such a way that $K_5$ and 
$K_{3,3}$ have only one crossing. 


Figure 4.1 Drawings of $K_5$ and $K_{3,3}$ with their respective crossing numbers, 
$\operatorname{cr}(K_5) = 1$ and $\operatorname{cr}(K_{3,3}) = 1$. The red dots mark the point where the edges cross. 


As both graphs $K_{3,3}$ and $K_5$ in Figure 4.1 are drawn with one crossing, their 
crossing number is at most 1. As they are known to be non-planar, their crossing 
number equals 1. Kuratowski’s theorem states that essentially these two graphs are 
the obstacles of planarity, as any nonplanar graph must contain a subdivision of one 
of them. We will explore similar obstacles for the non-planarity of Tanglegrams. 

Using Theorem 3, we assert multiple important ideas. One, in particular, is that if the 
graph crossing number of a Tanglegram augmented by an edge between the roots of 
its binary trees is nonzero, then so is its Tanglegram crossing number. 


Proposition 5 ([4]). Let $T = (\mathcal{L}, \mathcal{R}, \sigma)$ be a Tanglegram and let the roots of $\mathcal{L}$ and 
$\mathcal{R}$ be r and p, respectively. Let $T^*$ be the underlying graph of T augmented with an 
edge between r and p; see Figure 4.2. Then, $\operatorname{crt}(T) \ge \operatorname{cr}(T^*)$. 


Proof. Consider an optimal layout of the Tanglegram $T$ with $\operatorname{crt}(T)$ crossings. We
can create a drawing $D$ of $T^*$ by drawing the edge between $r$ and $\rho$ in the optimal
layout of $T$ such that this edge creates no new crossings. Then we have

\[ \operatorname{cr}(T^*) \le \operatorname{cr}(D) = \operatorname{crt}(T). \]

Figure 4.2 Drawing of an augmented Tanglegram $T^*$ with an augmented edge
connecting roots $r$ and $\rho$.
Proposition 6 ([4]). Let $T = (\mathcal{L}, \mathcal{R}, \sigma)$ be a Tanglegram and let the roots of $\mathcal{L}$
and $\mathcal{R}$ be $r$ and $\rho$, respectively. Let $T^*$ be the underlying graph of $T$ augmented with
an edge between $r$ and $\rho$; see Figure 4.2. Then, the following are equivalent:
1. $\operatorname{crt}(T) \ge 1$.
2. $\operatorname{cr}(T^*) \ge 1$.
3. $T^*$ contains a subdivision of $K_{3,3}$.


Proof. Note that $T^*$ has maximum degree 3, so it cannot contain a subdivision of
$K_5$. This means that Kuratowski's theorem implies that 2 and 3 are equivalent.

$1 \Rightarrow 2$ can be done with standard techniques using topology.

$2 \Rightarrow 1$ follows from Proposition 5.


By Definition 33, a Tanglegram is planar if its Tanglegram crossing number is zero.
From the viewpoint of ordinary crossing numbers, however, the size 4 Tanglegrams
are simply subdivisions of the four-vertex, 3-regular multi-graphs.

Definition 34. A Tanglegram $T = (\mathcal{L}, \mathcal{R}, \sigma)$ is $k$-edge planar if for any $M \subseteq \sigma$ with $|M| < k$, the
Tanglegram induced by $\sigma - M$ is not planar, but there is an $M' \subseteq \sigma$ with $|M'| = k$
such that the Tanglegram induced by $\sigma - M'$ is planar.

Definition 35. A multi-graph is a graph in which two vertices may be connected
by more than one edge. This includes a vertex being connected to itself by an edge,
called a loop.


We first note the following remark, which can be seen since $K_4$ can be drawn
without any crossings of the interior of its edges.

Remark 1. The complete graph on 4 vertices, $K_4$, is planar.


Any four-vertex, 3-regular multi-graph can be obtained by duplicating some edges of
an appropriate subgraph of $K_4$, and the size 4 Tanglegrams are simply subdivisions
of the four-vertex, 3-regular multi-graphs. If we view size 4 Tanglegrams as graphs,
they have crossing number 0 (in other words, as graphs, they are planar). However,
not all size 4 Tanglegrams are planar as Tanglegrams. In particular, the two Tanglegrams
shown in Figures 4.3 and 4.4 have Tanglegram crossing number 1, and we can use
Proposition 5 to show that they are not planar.

Figure 4.3 A nonplanar Tanglegram of size 4. The extended Tanglegram is a subdivided
$K_{3,3}$, but without the extra edge it is a subdivided $K_4$, which is a planar graph.

For now, we illustrate that planarity can be judged by how many matching
edges must be removed until the Tanglegram becomes planar. In particular, we show that for
a Tanglegram of size $n$, when we remove any one of the matching edges and suppress
the two leaves it connected, the crossing number of the original Tanglegram and the
crossing number of the suppressed Tanglegram are related in a particularly nice way.

Figure 4.4 A second nonplanar Tanglegram of size 4. The extended Tanglegram is a
subdivided $K_{3,3}$, but without the extra edge it is a subdivided $K_4$, which is a planar graph.


Theorem 7 ([1]). Let $T = (\mathcal{L}, \mathcal{R}, \sigma)$ be a Tanglegram of size $n \ge 3$ and let $e \in \sigma$ be
a matching edge of $T$. Then, we have that

\[ \operatorname{crt}(T) - \operatorname{crt}(T - e) \le n - 3. \]

Consequently, in any optimal layout, any matching edge crosses at most $n - 3$ edges.


Proof. We work by induction on $n$, and first verify the base case $n = 3$.
We previously found all the Tanglegrams of size 3 in Chapter 3. They were all
planar. Similarly, after removing a matching edge, the Tanglegram remains planar.
So,

\[ \operatorname{crt}(T) - \operatorname{crt}(T - e) = 0 - 0 = 0 \le 3 - 3 = n - 3. \]

Clearly, in any optimal layout, each matching edge crosses $0 = 3 - 3$ other edges.
Therefore, the theorem holds for $n = 3$.


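Now let $n \ge 4$ and suppose that the theorem holds for every Tanglegram of size $n - 1$:
removing a matching edge decreases the Tanglegram crossing number by at most $n - 4$,
and in any optimal layout every matching edge crosses at most $n - 4$ edges.

Fix some Tanglegram $T = (\mathcal{L}, \mathcal{R}, \sigma)$ of size $n$. Let $e \in \sigma$ be arbitrary. Let $e = uv$,
where $u \in \mathcal{L}$ and $v \in \mathcal{R}$. We fix an optimal layout $T'$ of $T - e$, that is, a layout of

\[ T' = (\mathcal{L}_u, \mathcal{R}_v, \sigma - e) \]

with the fewest crossings.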
Now let $w_{\mathcal{L}}$ be the parent vertex of $u$ and $\mathcal{L}'$ be the subtree rooted at the second
child of $w_{\mathcal{L}}$. Define $w_{\mathcal{R}}$ and $\mathcal{R}'$ in a similar manner. We have, as a result, two planar
drawings of $\mathcal{L}$ whose sub-drawing of $\mathcal{L}_u$ agrees with the drawing of $\mathcal{L}_u$ in $T'$:
1. One drawing with $u$ immediately above the leaves of $\mathcal{L}'$.
2. One drawing with $u$ immediately below the leaves of $\mathcal{L}'$.

Observe that the ordering of the leaves of $\mathcal{L}_u$ in each drawing of $\mathcal{L}$ is the same as
in $T'$. Also, by performing a switch operation at $w_{\mathcal{L}}$, we can obtain either of these
drawings of $\mathcal{L}$ from the other. The same holds for $\mathcal{R}$ and $\mathcal{R}'$, and Figure 4.5 illustrates two potential
positions of $u$ and $v$ in a drawing of $T$.

Figure 4.5 An illustration of the potential positions of $u$ and $v$ in the proof of
Theorem 7.


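All that is left to show is that there is some drawing $D$ of $T$ that uses one of these
two drawings of each of $\mathcal{L}$ and $\mathcal{R}$ and in which the matching edge $e \in \sigma$ crosses at most $n - 3$
edges. Then, we will have that

\[ \operatorname{crt}(T) \le \operatorname{cr}(D) \le \operatorname{crt}(T - e) + (n - 3), \]

or, equivalently,

\[ \operatorname{crt}(T) - \operatorname{crt}(T - e) \le n - 3. \]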


We have two cases to deal with:
1. $\mathcal{L}'$ and $\mathcal{R}'$ each have exactly one leaf and they are matched in $\sigma - e$, or
2. There is a leaf in $\mathcal{L}'$ and a leaf in $\mathcal{R}'$ which are not matched with one another.

Case 1: Let $e'$ be the edge matching the single leaves in $\mathcal{L}'$ and $\mathcal{R}'$. By the induction
hypothesis, $e'$ crosses at most $n - 4$ edges in the layout $T'$. Draw $T$ with
$u$ above $\mathcal{L}'$ and $v$ above $\mathcal{R}'$, so that $e$ is parallel to $e'$. Then $e$ crosses precisely
those edges that $e'$ crosses, so $e$ crosses at most $n - 4$ edges. (See Figure 4.6 for an
illustration of this case.)
Case 2: Let $u_{\mathcal{L}} \in \mathcal{L}'$ be a leaf and $v_{\mathcal{R}} \in \mathcal{R}'$ be a leaf such that $u_{\mathcal{L}}$ and $v_{\mathcal{R}}$ are not
matched together. We define two more ideas below, where the respective tree is $\mathcal{R}'$ for a
leaf of $\mathcal{L}'$ and $\mathcal{L}'$ for a leaf of $\mathcal{R}'$:

1. We say a leaf is matched upward if the leaf to which it is connected is at least
as high as the lowest leaf in the respective tree.

2. We say a leaf is matched downward if the leaf to which it is connected is no
higher than the highest leaf in the respective tree.

Let $e_1$ and $e_2$ be the matching edges with endpoints $u_{\mathcal{L}}$ and $v_{\mathcal{R}}$, respectively. We have
two subcases to consider:

1. Let $u_{\mathcal{L}}$ and $v_{\mathcal{R}}$ both be matched upward (respectively, downward). Draw the
vertex $u$ directly below (respectively, above) the leaves of $\mathcal{L}'$ and the vertex $v$ directly below
(respectively, above) the leaves of $\mathcal{R}'$. Then, $e$ does not cross $e_1$ or $e_2$, and so, $e$ crosses
at most $n - 3$ of the remaining $n - 1$ matching edges. Thus,

\[ \operatorname{crt}(T) - \operatorname{crt}(T - e) \le n - 3. \]

2. Let $u_{\mathcal{L}}$ be matched to a leaf higher (respectively, lower) than the leaves of $\mathcal{R}'$
and let $v_{\mathcal{R}}$ be matched to a leaf lower (respectively, higher) than the leaves of
$\mathcal{L}'$. Draw the vertex $u$ directly below (respectively, above) the leaves of $\mathcal{L}'$ and
$v$ directly above (respectively, below) the leaves of $\mathcal{R}'$. Then, $e$ does not cross
$e_1$ or $e_2$, and so, $e$ crosses at most $n - 3$ edges. Thus,

\[ \operatorname{crt}(T) - \operatorname{crt}(T - e) \le n - 3. \]

Figure 4.6 illustrates both of these subcases.


Now consider an optimal drawing of $T$ with $\operatorname{crt}(T)$ many crossings. Take a matching
edge $e$ that crosses $x$ other edges. The removal of $e$ results in a subdrawing $D$ of
$T - e$ with $\operatorname{crt}(T) - x$ crossings. Since $\operatorname{crt}(T - e) \le \operatorname{cr}(D) = \operatorname{crt}(T) - x$, we get that

\[ x \le \operatorname{crt}(T) - \operatorname{crt}(T - e) \le n - 3. \]

Therefore, we have considered all possibilities and the Theorem is proven.
Figure 4.6 An illustration of the possible relations between $\mathcal{L}'$ and $\mathcal{R}'$ in the proof
of Theorem 7: (a) $\mathcal{L}'$ and $\mathcal{R}'$ each have exactly one leaf and they are matched in
$\sigma - e$. (b) $u_{\mathcal{L}}$ and $v_{\mathcal{R}}$ are not matched to each other and are both matched upward.
(c) $u_{\mathcal{L}}$ is matched to a leaf higher than the leaves of $\mathcal{R}'$, and $v_{\mathcal{R}}$ is matched to a
leaf lower than the leaves of $\mathcal{L}'$.


Definition 36. A Tanglegram $T$ is $k$-edge planar if for any $M \subseteq \sigma$ with $|M| < k$, the
Tanglegram induced by $\sigma - M$ is not planar, but there is an $M' \subseteq \sigma$ with $|M'| = k$
such that the Tanglegram induced by $\sigma - M'$ is planar.


Definition 37. For each $n \ge 4$, we define the Caterpillar Tanglegram $P_n = (L, R, \sigma_n)$
as follows: $L$ and $R$ are copies of the rooted caterpillar $C_n$. We label the leaves of $L$
as $u_i$, where $i$ is the leaf's distance from the root. Since there are precisely two leaves
at distance $n - 1$, we arbitrarily label one of these $u_n$ instead. Similarly, the leaves
of $R$ are labeled using $v_i$. Finally, we construct $\sigma_n = \{u_i v_{n-i} \mid i \in [n - 1]\} \cup \{u_n v_n\}$.


Theorem 8 ([1]). For each $n \ge 4$, the Caterpillar Tanglegram $P_n$ is 1-edge planar and
has a crossing number of $n - 3$.


Proof. Observe in Figure 4.7 that $\operatorname{crt}(P_n) \le n - 3$, and that the removal of the $u_n v_n$
edge results in a planar Tanglegram. As $n \ge 4$ gives $n - 3 \ge 1$, the rest of the
statement is proved if we show that $\operatorname{crt}(P_n) \ge n - 3$. We prove this by induction on
$n$.
Figure 4.7 The Caterpillar Tanglegram Pg.


We have shown earlier that $\operatorname{crt}(P_4) = 1$ (refer back to Figure 4.4), so the statement
is true for $n = 4$. Let $n \ge 4$ and assume that $\operatorname{crt}(P_n) = n - 3$. Consider $P_{n+1}$. Since
$P_n$ is an induced subtanglegram of $P_{n+1}$ and $P_n$ is nonplanar, we have that $\operatorname{crt}(P_{n+1}) \ge 1$.

Consider an optimal layout $D$ of $P_{n+1}$ with $\operatorname{crt}(P_{n+1}) \ge 1$ many crossings. $D$ must
contain a crossing pair of edges, so one of these edges is of the form $u_i v_{n+1-i}$ for some
$i \in [n]$. The removal of $u_i v_{n+1-i}$ from $P_{n+1}$ gives a copy of $P_n$ (and an induced
layout $D'$ of $P_n$ from our optimal layout of $P_{n+1}$). Since $u_i v_{n+1-i}$ crossed at least one
edge, we have that $n - 3 \le \operatorname{crt}(P_n) \le \operatorname{cr}(D') \le \operatorname{crt}(P_{n+1}) - 1$, which gives that
$n - 2 = (n + 1) - 3 \le \operatorname{crt}(P_{n+1})$. Thus, the result is achieved.
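The upper bound $\operatorname{crt}(P_n) \le n - 3$ used at the start of the proof can be checked on one explicit layout. The sketch below is an illustration under stated assumptions: leaves are identified by their indices, a layout is given by the top-to-bottom order of the leaves on each side, matching edges are straight segments, and the particular layout chosen is one valid layout of the two caterpillars, not necessarily the one drawn in Figure 4.7. The function names are hypothetical.

```python
def caterpillar_matching(n):
    """Matching sigma_n of P_n as pairs (i, j), meaning u_i is matched to v_j."""
    return [(i, n - i) for i in range(1, n)] + [(n, n)]


def sample_layout(n):
    """One layout of P_n, given as leaf indices from top to bottom.

    Left:  u_1, ..., u_{n-2}, u_n, u_{n-1}
    Right: v_{n-1}, v_n, v_{n-2}, ..., v_1
    In both orders every leaf child sits at one end of the block of leaves
    below its sibling, so each caterpillar is drawn without crossings.
    """
    left = list(range(1, n - 1)) + [n, n - 1]
    right = [n - 1, n] + list(range(n - 2, 0, -1))
    return left, right


def crossings(left, right, matching):
    """Count pairs of matching edges whose endpoints appear in opposite order."""
    partner = dict(matching)
    pos = {j: k for k, j in enumerate(right)}
    # Leaves whose matching edge was removed are simply skipped.
    heights = [pos[partner[i]] for i in left if i in partner]
    return sum(
        heights[a] > heights[b]
        for a in range(len(heights))
        for b in range(a + 1, len(heights))
    )


for n in range(4, 9):
    left, right = sample_layout(n)
    sigma = caterpillar_matching(n)
    without_last = [edge for edge in sigma if edge != (n, n)]
    print(n, crossings(left, right, sigma), crossings(left, right, without_last))
```

In this layout only the edge $u_n v_n$ is involved in crossings, which is why deleting it leaves a crossing-free drawing; the matching lower bound still requires the induction argument above.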
CHAPTER 5


PROPERTIES OF INDUCED SUBTANGLEGRAMS 


Recall that an induced subgraph $G' = (V', E')$ of a graph $G = (V, E)$ is a graph
consisting of a vertex subset $V' \subseteq V$ and the edge set $E'$ of all edges of $E$ with both
endpoints in $V'$. An induced subtree $T'$ acts similarly on a tree $T$. We will focus
primarily on induced binary subtrees, which are used quite often in the study of phylogenetics.

In a rooted plane binary tree $B_r$ with a root $r$, we say a subset $L$ of the leaves of
$B_r$ induces another rooted binary tree by taking the smallest subtree containing the
leaves in $L$, designating the vertex $r_L$ of this subtree closest to the old root as the
new root, and finally suppressing all vertices of degree 2 other than $r_L$. See Figure
5.1 for an illustration of this process.


Figure 5.1 A rooted binary tree with root $r$, four leaves $L = \{\ell_1, \ell_2, \ell_3, \ell_4\}$ selected,
the vertex $r_{\ell_1 \ell_2 \ell_3 \ell_4}$, and the tree induced by the selected leaves.
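The three steps of the induced-tree construction (take the smallest subtree containing the selected leaves, re-root at the vertex closest to the old root, suppress vertices of degree 2) can be sketched recursively. The sketch assumes a rooted plane binary tree is encoded as nested pairs with string leaf labels; this encoding and the name `induced_subtree` are illustrative and not taken from the thesis.

```python
def induced_subtree(tree, selected):
    """Return the tree induced by the selected leaves, or None if there are none.

    A tree is either a leaf label (str) or a pair (first_child, second_child).
    The order of the children is preserved, so the result is again a plane tree.
    """
    if isinstance(tree, str):
        return tree if tree in selected else None
    first = induced_subtree(tree[0], selected)
    second = induced_subtree(tree[1], selected)
    # If only one side contains selected leaves, this vertex is dropped: it is
    # either a suppressed degree-2 vertex or lies above the new root.
    if first is None:
        return second
    if second is None:
        return first
    # Both sides survive, so this vertex is kept as a branching vertex.
    return (first, second)


tree = (("l1", "x"), (("l2", "l3"), ("y", "l4")))
print(induced_subtree(tree, {"l1", "l2", "l3", "l4"}))
print(induced_subtree(tree, {"l2", "l3"}))
```

In the second call the result is rooted at the parent of `l2` and `l3`, which is the vertex of the smallest containing subtree closest to the old root.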


Consider a layout of a Tanglegram $T = (\mathcal{L}, \mathcal{R}, \sigma)$ with roots $r$ and $\rho$ of $\mathcal{L}$ and $\mathcal{R}$,
respectively. Let $\sigma' \subseteq \sigma$ be a subset of the set of matching edges. The leaf sets of $\sigma'$
induce a left and a right induced binary plane tree, which, after putting back the edges
of $\sigma'$ between the corresponding leaves, define a layout of a Tanglegram $T'$.

Definition 38. We call $T'$ the induced subtanglegram of a Tanglegram $T$; $T'$ is induced
by the matching edge set $\sigma'$ (see Figure 5.2).


Figure 5.2 A Tanglegram $T$ with matching edges $\sigma' = \{e_1, e_2, e_3\}$ selected, the
vertices $r_{e_1 e_2 e_3}$ and $\rho_{e_1 e_2 e_3}$, and the subtanglegram induced by the selected edges.


We claim that, given either a planar or non-planar Tanglegram $T$
of size $n \ge 4$, we can always find an induced subtanglegram $T'$ such that $T'$ is
planar. In particular, we assert that the largest possible $T'$ that is planar will always
be of size at least $n - \operatorname{crt}(T)$.

Theorem 9. Let $T = (\mathcal{L}, \mathcal{R}, \sigma)$ be a Tanglegram of size $n \ge 4$. Then, there exists a planar
induced subtanglegram $T' = (\mathcal{L}', \mathcal{R}', \sigma')$ of $T$ such that

\[ |T'| \ge n - \operatorname{crt}(T). \]

Consequently, if $T$ is $k$-edge-planar, then $k \le \operatorname{crt}(T)$.
Proof. Let $T$ be a Tanglegram of size $n \ge 4$. We consider cases:

1. If $T$ is planar, then any induced subtanglegram $T'$ of $T$ is planar, by the
definition of how $T'$ is constructed. So $\operatorname{crt}(T) = 0$ and, taking $T' = T$,

\[ |T'| = n \ge n - 0 = n. \]

2. If $T$ is nonplanar, fix an optimal layout of $T$ and keep deleting matching edges
that are involved in a crossing until the induced layout has no crossings. Each
deletion removes at least one crossing, so at most $\operatorname{crt}(T)$ edges are deleted. The
resulting induced subtanglegram $T'$ is planar and satisfies

\[ |T'| \ge n - \operatorname{crt}(T). \]

We have shown that the removal of at most $\operatorname{crt}(T)$ many edges from $T$ results in
a planar Tanglegram. By definition, $k$ is the smallest number of matching edges whose
removal makes a $k$-edge-planar Tanglegram planar, so $k \le \operatorname{crt}(T)$.
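The deletion argument in the proof can be sketched on a fixed layout. The sketch assumes each matching edge is recorded as a pair (height of its left leaf, height of its right leaf) with all heights distinct on each side, and that edges are straight segments; the function names are hypothetical. Applied to an optimal layout, the number of deleted edges is at most $\operatorname{crt}(T)$, as in the proof; applied to an arbitrary layout it is at most the number of crossings of that layout.

```python
def crossing_pairs(edges):
    """All pairs of edges that cross; an edge is (left_height, right_height)."""
    return [
        (e, f)
        for k, e in enumerate(edges)
        for f in edges[k + 1:]
        if (e[0] - f[0]) * (e[1] - f[1]) < 0
    ]


def delete_until_planar(edges):
    """Delete one edge of some crossing pair until no crossing remains.

    Returns (kept_edges, deleted_edges). Every deletion removes at least one
    crossing and creates none, so the loop runs at most as many times as the
    layout has crossings.
    """
    edges = list(edges)
    deleted = []
    while True:
        pairs = crossing_pairs(edges)
        if not pairs:
            return edges, deleted
        victim = pairs[0][0]
        edges.remove(victim)
        deleted.append(victim)


# A layout with five matching edges and two crossings.
layout = [(0, 0), (1, 2), (2, 3), (3, 1), (4, 4)]
kept, deleted = delete_until_planar(layout)
print(len(crossing_pairs(layout)), len(kept), len(deleted))
```

The greedy choice of which edge to delete is arbitrary here; a cleverer choice may delete fewer edges, which is consistent with the theorem giving only the inequality $k \le \operatorname{crt}(T)$.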


Figure 5.3 Red and blue nodes represent the two partitions of a subdivided $K_{3,3}$
with all vertices on the left side.


Recall that Proposition 6 states that for every non-planar Tanglegram, the augmented
graph $T^*$ contains a subdivision of $K_{3,3}$. So, we state the next theorem, which was
proved in [4].

Theorem 10 ([4]). Every non-planar Tanglegram contains one of the two nonplanar
Tanglegrams of size 4 (Figures 4.3 and 4.4) as an induced subtanglegram.


Now observe that Theorem 10 is stronger than the statement of Proposition 6,
as it provides a subdivided $K_{3,3}$ such that three vertices of the $K_{3,3}$ lie in the left
subtree, and the other three are in the right subtree. As Figure 5.3 shows, if we find a
subdivided $K_{3,3}$ in the augmented graph of a non-planar Tanglegram, this subdivided
$K_{3,3}$ does not have to correspond to any induced subtanglegram. So it can be seen
that Theorem 10 gives more structure than Proposition 6.